Nearest Cluster Classifier

Authors

  • Hamid Parvin
  • Moslem Mohamadi
  • Sajad Parvin
  • Zahra Rezaei
  • Behrouz Minaei-Bidgoli
Abstract

In this paper, a new classification method is proposed that uses a clustering method to reduce the train set of the K-Nearest Neighbor (KNN) classifier and also to enhance its performance. The proposed method is called the Nearest Cluster Classifier (NCC). Inspired by the traditional KNN algorithm, the main idea is to classify a test sample according to the tag of its nearest neighbor. First, the train set is clustered into a number of partitions. By obtaining a number of partitions through several runs of a simple clustering algorithm, the NCC algorithm extracts a large number of clusters from the partitions. Then, the label of each cluster center produced in the previous step is determined by a majority vote over the class labels of the patterns in that cluster. The NCC algorithm iteratively adds a cluster to a pool of selected clusters, which serves as the train set of the final 1-NN classifier, as long as the 1-NN classifier's performance over a set of patterns comprising the train set and the validation set improves. The selected set of the most accurate clusters is then used as the train set of the final 1-NN classifier. After that, the class label of a new test sample is determined according to the class label of the nearest cluster center. Computationally, NCC is about K times faster than KNN. The proposed method is evaluated on several real datasets from the UCI repository. Empirical studies show an excellent improvement in terms of both accuracy and time complexity in comparison with the KNN classifier.
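The procedure the abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' code: the clustering routine, the parameter choices (n_runs, k), and all function names are assumptions made for the sketch.

```python
# Sketch of the NCC idea: cluster the train set several times, label each
# cluster center by majority vote, then greedily keep only the centers that
# improve 1-NN accuracy on train+validation data. Illustrative only.
import numpy as np

def kmeans(X, k, rng, n_iter=20):
    """Plain k-means; returns (centers, point-to-cluster assignment)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = X[assign == c].mean(axis=0)
    return centers, assign

def majority(labels):
    """Majority-vote label of a cluster's member patterns."""
    vals, counts = np.unique(labels, return_counts=True)
    return vals[counts.argmax()]

def predict(centers, labels, X):
    """Classify each row of X by the label of its nearest cluster center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return labels[d.argmin(axis=1)]

def build_ncc(X_tr, y_tr, X_va, y_va, n_runs=3, k=4, seed=0):
    rng = np.random.default_rng(seed)
    cand_c, cand_l = [], []
    for _ in range(n_runs):                 # several clustering runs
        centers, assign = kmeans(X_tr, k, rng)
        for c in range(k):
            members = y_tr[assign == c]
            if len(members):
                cand_c.append(centers[c])
                cand_l.append(majority(members))
    cand_c, cand_l = np.array(cand_c), np.array(cand_l)

    def score(idx):
        return np.mean(predict(cand_c[idx], cand_l[idx], X_va) == y_va) if idx else -1.0

    chosen = []
    for i in range(len(cand_c)):            # greedy forward selection of clusters
        if score(chosen + [i]) > score(chosen):
            chosen.append(i)
    return cand_c[chosen], cand_l[chosen]

# Demo on synthetic two-class data (the paper uses UCI datasets instead).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], 0.5, size=(60, 2)),
               rng.normal([4, 4], 0.5, size=(60, 2))])
y = np.array([0] * 60 + [1] * 60)
perm = rng.permutation(120)
X_tr, y_tr = X[perm[:80]], y[perm[:80]]
X_va, y_va = X[perm[80:]], y[perm[80:]]
centers, labels = build_ncc(X_tr, y_tr, X_va, y_va)
acc = np.mean(predict(centers, labels, X_va) == y_va)
```

Because prediction compares a test sample against the selected cluster centers rather than every training pattern, the classification cost shrinks roughly in proportion to the compression of the train set, which is the source of the speed-up the abstract claims.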


Similar articles

Fuzzy Nearest Prototype Classifier Applied to Speaker Identification

In a vector quantisation (VQ) based speaker identification system, a speaker model is created for each speaker from the training speech data by using the k-means clustering algorithm. For an unknown utterance analysed into a sequence of vectors, the nearest prototype classifier is used to identify the speaker. To achieve higher speaker identification accuracy, a fuzzy approach is proposed in th...


Hilbert Space Filling Curve (hsfc) Nearest Neighbor Classifier

The Nearest Neighbor algorithm is one of the simplest and oldest classification techniques. A given collection of historic data (Training Data) of known classification is stored in memory. Then, based on the stored knowledge, the classification of an unknown datum (Test Data) is predicted by finding the classification of its nearest neighbor. For example, if an instance from the test set is presen...
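The nearest-neighbor rule described above can be sketched in a few lines. This is a minimal illustration of the plain 1-NN rule only, not the Hilbert-space-filling-curve variant that paper develops:

```python
# Minimal 1-nearest-neighbor classification sketch (illustrative only).
import numpy as np

def nn_classify(train_X, train_y, x):
    """Return the label of the training point nearest to x (Euclidean)."""
    dists = np.linalg.norm(train_X - x, axis=1)
    return train_y[dists.argmin()]

train_X = np.array([[0.0, 0.0], [5.0, 5.0]])   # stored training data
train_y = np.array(["a", "b"])                 # known classifications
pred = nn_classify(train_X, train_y, np.array([0.4, 0.3]))  # nearest to [0, 0]
```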


A Review of Cluster Based Classification Technique

Fusion and ensembling are important techniques in machine learning. Fusion combines the feature attributes of different classifiers and improves the classification of a binary classifier. In contrast, the ensemble technique provides a facility to merge two individual classifiers and improve the performance of both. The ensemble technique of classifiers depends on the number of nearer points of classi...


Word Spotting in Scanned Tamil Land Documents using K-Nearest Neighbor

Word spotting is a technique that extracts text from an input image. Here, it is implemented on scanned Tamil land documents. Using Gabor features, we extract the feature values for the input image. The main goal is to recognize the text in the document using a K-nearest-neighbor classifier. The features were calculated and combined. Using these features, we can classify and rec...


Unsupervised Learning of Prototypes and Attribute Weights Summary

In this paper, we introduce new algorithms that perform clustering and feature weighting simultaneously and in an unsupervised manner. The proposed algorithms are computationally and implementationally simple, and learn a different set of feature weights for each identified cluster. The cluster-dependent feature weights offer two advantages. First, they guide the clustering process to partitio...



Journal title:

Volume   Issue

Pages  -

Publication date: 2012